SUMMARY

The condition number $c_\varphi$ of a non-singular matrix $A$ is defined by $c_\varphi(A) = \varphi(A)\,\varphi(A^{-1})$, where ordinarily $\varphi$ is a norm. It is known that for certain norms, the matrix $AA^*$ is more "ill-conditioned" than $A$, i.e., $c_\varphi(A) \le c_\varphi(AA^*)$. We prove that this inequality holds whenever the norm $\varphi$ is unitarily invariant ($\varphi(A)$ is a function of the characteristic roots of $AA^*$). However, we show that the inequality is independent of the usual norm axioms. Some more general inequalities are also obtained.


NORMS  AND  CONDITION  NUMBERS 


By 

Albert  W.  Marshall  and  Ingram  Olkin 
Boeing  Scientific  Research  Laboratories  and  Stanford  University 

1. Summary and Introduction.

The genesis of this study is the proposition that under certain conditions, the matrix $AA^*$ is more "ill-conditioned" than $A$. More precisely, the condition number $c_\varphi(A)$ is defined for non-singular matrices $A$ as

$$c_\varphi(A) = \varphi(A)\,\varphi(A^{-1}),$$

where ordinarily $\varphi$ is a norm. The statement concerning ill-conditioning of $AA^*$ is the inequality

(c) $c_\varphi(A) \le c_\varphi(AA^*)$.

For the cases where $\varphi(A)$ is the maximum absolute characteristic root of $A$ and where $\varphi(A) = (\operatorname{tr} AA^*)^{1/2}$, inequality (c) was proved by O. Taussky-Todd [6]. This raises the question of whether (c) is true for all norms. In this paper, we show that quite the contrary is true; (c) is independent of the usual norm axioms. However, we also prove that (c) does hold for a quite general class of norms.

In the course of proving these results, we obtain some inequalities for symmetric gauge functions, which may be of independent interest.
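Inequality (c) can be checked numerically for particular norms. The following is a minimal sketch, assuming NumPy; the helper names `cond`, `spectral`, and `frobenius`, the random test matrix, and the seed are illustrative choices and not part of the paper. It uses the spectral norm (largest singular value) and $(\operatorname{tr} AA^*)^{1/2}$, both of which are unitarily invariant.

```python
import numpy as np

def cond(A, norm):
    """Condition number c_phi(A) = phi(A) * phi(A^{-1}) for a callable norm."""
    return norm(A) * norm(np.linalg.inv(A))

def spectral(A):
    # Largest singular value of A (assumed choice of unitarily invariant norm).
    return np.linalg.norm(A, 2)

def frobenius(A):
    # (tr AA*)^{1/2}
    return np.linalg.norm(A, "fro")

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
G = A @ A.conj().T  # the matrix AA*

# Map each norm name to the pair (c(A), c(AA*)).
results = {name: (cond(A, f), cond(G, f))
           for name, f in [("spectral", spectral), ("frobenius", frobenius)]}
for name, (cA, cG) in results.items():
    print(f"{name}: c(A) = {cA:.4f}, c(AA*) = {cG:.4f}")
```

For the spectral norm the two quantities are related exactly, since the singular values of $AA^*$ are the squares of those of $A$, so that $c(AA^*) = c(A)^2$.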
2. Gauge functions and matrix norms.

We call $\varphi$ a matrix norm if

(aI) $\varphi(A) > 0$ when $A \ne 0$,

(aII) $\varphi(\alpha A) = |\alpha|\,\varphi(A)$ for complex $\alpha$,

(aIII) $\varphi(A+B) \le \varphi(A) + \varphi(B)$.

In addition to these basic axioms, various other conditions are sometimes imposed:

(aIV) $\varphi(E_{ij}) = 1$,

where $E_{ij}$ is the matrix with one in the (i,j)-th position and zero elsewhere,

(aV) $\varphi(AB) \le \varphi(A)\,\varphi(B)$,

(aVI) $\varphi(A) = \varphi(UA) = \varphi(AU)$ for all unitary matrices $U$.

If $\varphi$ satisfies aI, aII, aIII, and aVI, $\varphi$ is called a unitarily invariant norm.

There is an important connection between unitarily invariant norms and symmetric gauge functions. A function $\Phi$ on a complex vector space is called a gauge function if

(bI) $\Phi(u) > 0$ when $u \ne 0$,

(bII) $\Phi(\alpha u) = |\alpha|\,\Phi(u)$ for complex $\alpha$,

(bIII) $\Phi(u+v) \le \Phi(u) + \Phi(v)$.

Often it is convenient to assume, in addition, that

(bIV) $\Phi(e_i) = 1$,

where $e_i$ is the vector with one in the i-th place and zero elsewhere. If, in addition to bI, bII, and bIII,

(bV) $\Phi(u_1, \ldots, u_n) = \Phi(\varepsilon_1 u_{i_1}, \ldots, \varepsilon_n u_{i_n})$

whenever $\varepsilon_j = \pm 1$ and $(i_1, \ldots, i_n)$ is a permutation of $(1, \ldots, n)$, then $\Phi$ is called a symmetric gauge function.

It was noted by Von Neumann [7] that a norm $\varphi$ is unitarily invariant if and only if there exists a symmetric gauge function $\Phi$ such that $\varphi(A) = \Phi(\alpha)$ for all $A$, where $\alpha = (\alpha_1, \ldots, \alpha_n)$ and $\alpha_1^2, \ldots, \alpha_n^2$ are the eigenvalues of $AA^*$.
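The correspondence between symmetric gauge functions and unitarily invariant norms can be illustrated as follows. This is a sketch assuming NumPy; the helpers `ky_fan` and `ui_norm` are hypothetical names, and the sum of the $k$ largest absolute values is used only as one convenient example of a symmetric gauge function.

```python
import numpy as np

def ky_fan(k):
    """Example symmetric gauge function: sum of the k largest |u_i|."""
    return lambda u: np.sort(np.abs(u))[::-1][:k].sum()

def ui_norm(gauge, A):
    """phi(A) = Phi(alpha), where alpha holds the singular values of A,
    i.e. the non-negative square roots of the eigenvalues of AA*."""
    return gauge(np.linalg.svd(A, compute_uv=False))

rng = np.random.default_rng(1)  # arbitrary seed
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
# A unitary matrix obtained from the QR factorization of a complex Gaussian matrix.
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

phi = lambda M: ui_norm(ky_fan(2), M)
values = (phi(A), phi(U @ A), phi(A @ U))  # should agree up to rounding (aVI)
print(values)
```

Any other symmetric gauge function, for instance an $\ell_p$ norm of the vector of singular values, could be passed to `ui_norm` in the same way.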

If $\Phi$ is a symmetric gauge function and $u$, $v$ satisfy $|u_i| \le |v_i|$, $i = 1, \ldots, n$, then it follows [5, p. 85] that

(2.1) $\Phi(u_1, \ldots, u_n) \le \Phi(v_1, \ldots, v_n)$.
If $\Phi$ is a symmetric gauge function satisfying bIV, then [5, p. 86]

(2.2) $\max_i |u_i| \le \Phi(u_1, \ldots, u_n) \le \sum_{i=1}^{n} |u_i|$.

If $\varphi$ is the unitarily invariant matrix norm determined by $\Phi$ as above, then it follows that

$$\frac{\varphi(AB)}{\varphi(A)\,\varphi(B)} \le \frac{\sum_{i=1}^{n} [\lambda_i(ABB^*A^*)]^{1/2}}{[\max_i \lambda_i(AA^*)]^{1/2}\,[\max_j \lambda_j(BB^*)]^{1/2}} \le n,$$

where $\lambda_i(M)$ are the eigenvalues of $M$. Thus, for any $k \ge n$, $k\varphi$ is a unitarily invariant matrix norm also satisfying aV.

Of course, $\varphi$ itself satisfies aIV (since $\Phi$ satisfies bIV), and this property is destroyed by the renormalization.

3. The condition number inequality.

Theorem 3.1. If $\varphi$ is a unitarily invariant norm, then

(c) $c_\varphi(A) \le c_\varphi(AA^*)$.

If $\Phi$ is a symmetric gauge function which determines $\varphi$, then we may rewrite (c) in the form

$$\Phi(\alpha_1, \ldots, \alpha_n)\,\Phi(\alpha_1^{-1}, \ldots, \alpha_n^{-1}) \le \Phi(\alpha_1^{2}, \ldots, \alpha_n^{2})\,\Phi(\alpha_1^{-2}, \ldots, \alpha_n^{-2}).$$

Thus, Theorem 3.1 is a very special case of

Theorem 3.2. If $\Phi$ is a symmetric gauge function, then $\Phi(\alpha_1^{r}, \ldots, \alpha_n^{r})\,\Phi(\alpha_1^{-r}, \ldots, \alpha_n^{-r})$ is increasing in $r > 0$, where $\alpha_i > 0$.

The proof of Theorem 3.2 is embodied in the lemmas below.
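The monotonicity asserted in Theorem 3.2 can be observed numerically. The sketch below assumes NumPy and uses $\ell_p$ norms as sample symmetric gauge functions; the vector `alpha`, the grid `rs`, and the helper names are illustrative assumptions only.

```python
import numpy as np

def lp_gauge(p):
    """l_p norm on vectors, a symmetric gauge function for p >= 1."""
    return lambda u: np.linalg.norm(u, ord=p)

def product(gauge, alpha, r):
    """Phi(alpha^r) * Phi(alpha^{-r}) for a positive vector alpha."""
    return gauge(alpha ** r) * gauge(alpha ** (-r))

alpha = np.array([3.0, 1.5, 1.0, 0.2])  # arbitrary positive test vector
rs = np.linspace(0.1, 3.0, 30)          # increasing grid of exponents r

# For each p, evaluate the product along the grid of r values.
curves = {p: np.array([product(lp_gauge(p), alpha, r) for r in rs])
          for p in (1, 2, np.inf)}
for p, vals in curves.items():
    # True when the sampled values never decrease as r grows.
    print(p, bool(np.all(np.diff(vals) >= -1e-9)))
```

A grid check of this kind is only a sanity test on sampled points, not a proof; the case $r = 1$ versus $r = 2$ corresponds to inequality (c).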


We say $(a_1, \ldots, a_n)$ is majorized by $(b_1, \ldots, b_n)$, written $(a) \prec (b)$, if

(i) $a_1 \ge \cdots \ge a_n > 0$, $b_1 \ge \cdots \ge b_n > 0$,

(ii) $\sum_{i=1}^{k} a_i \le \sum_{i=1}^{k} b_i$, $k = 1, \ldots, n-1$,

(iii) $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i$.

Lemma 3.3. If $(a) \prec (b)$, and $\Phi$ is a symmetric gauge function, then

(3.1) $\Phi(a_1, \ldots, a_n) \le \Phi(b_1, \ldots, b_n)$,

(3.2) $\Phi(a_1^{-1}, \ldots, a_n^{-1}) \le \Phi(b_1^{-1}, \ldots, b_n^{-1})$.

Proof. A proof of (3.1) has been given by Fan [1]; by an argument similar to his, we prove (3.2).

First, note that we can assume for $h$ and $j$ fixed, $h < j$,

(3.3) $a_h = \alpha b_h + (1-\alpha) b_j$, $a_j = (1-\alpha) b_h + \alpha b_j$, $a_i = b_i$, $i \ne h, j$.

That this is true follows from the fact that if $(a) \prec (b)$, then $a$ can be derived from $b$ by successive applications of a finite number of transformations of the form (3.3) (see [2, p. 47]).
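A single transformation of the form (3.3) and its effect on a few symmetric gauge functions can be sketched as follows. NumPy is assumed; the helper names `t_transform` and `is_majorized`, the mixing weight `t` (taken in $[0, 1]$, which is an assumption of this sketch), and the sample vector `b` are illustrative.

```python
import numpy as np

def t_transform(b, h, j, t):
    """Apply a transformation of the form (3.3): mix coordinates h and j
    of b with weight t (assumed to lie in [0, 1])."""
    a = b.astype(float).copy()
    a[h] = t * b[h] + (1 - t) * b[j]
    a[j] = (1 - t) * b[h] + t * b[j]
    return a

def is_majorized(a, b, tol=1e-12):
    """Check conditions (ii) and (iii) after sorting both vectors decreasingly."""
    a, b = np.sort(a)[::-1], np.sort(b)[::-1]
    partial = np.all(np.cumsum(a)[:-1] <= np.cumsum(b)[:-1] + tol)
    return bool(partial and abs(a.sum() - b.sum()) <= tol)

b = np.array([5.0, 3.0, 1.0, 0.5])
a = t_transform(b, 0, 2, 0.7)  # approximately [3.8, 3.0, 2.2, 0.5]

gauges = {"l1": lambda u: np.abs(u).sum(),
          "l2": lambda u: np.sqrt((np.abs(u) ** 2).sum()),
          "max": lambda u: np.abs(u).max()}
# For each gauge, test (3.1) on (a, b) and (3.2) on the reciprocal vectors.
checks = {name: (bool(g(a) <= g(b) + 1e-12), bool(g(1 / a) <= g(1 / b) + 1e-12))
          for name, g in gauges.items()}
print(is_majorized(a, b), checks)
```

The same helpers can be chained to apply several transformations in succession, which mirrors the reduction used in the proof.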
which is true if and only if

[display illegible in the source: an inequality between sums over $i = 1, \ldots, k$ and $j = k+1, \ldots, n$]

The latter follows from $\alpha_i \ge \alpha_j$, $i < j$. ||

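Observe that by (3.1) and Lemma 3.4, we have

$$\frac{\Phi(\alpha_1^{r}, \ldots, \alpha_n^{r})}{\sum_i \alpha_i^{r}} \le \frac{\Phi(\alpha_1^{s}, \ldots, \alpha_n^{s})}{\sum_i \alpha_i^{s}}.$$

In view of (2.2), it is perhaps natural to expect that

(3.4) $$\frac{\alpha_1^{r}}{\alpha_1^{s}} \le \frac{\Phi(\alpha_1^{r}, \ldots, \alpha_n^{r})}{\Phi(\alpha_1^{s}, \ldots, \alpha_n^{s})} \le \frac{\sum_i \alpha_i^{r}}{\sum_i \alpha_i^{s}}, \qquad 0 < r < s, \quad \alpha_1 \ge \cdots \ge \alpha_n > 0,$$

for any symmetric gauge function $\Phi$. To see this we need only prove the left hand inequality, which may be written in the form

(3.5) $\Phi([\alpha_1/\alpha_1]^{s}, \ldots, [\alpha_n/\alpha_1]^{s}) \le \Phi([\alpha_1/\alpha_1]^{r}, \ldots, [\alpha_n/\alpha_1]^{r})$,

and which is a consequence of (2.1).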

An interesting counterpart to Theorem 3.2 can be obtained from (3.4).

Theorem 3.5. If $\Phi$ is a symmetric gauge function satisfying bIV, then $[\Phi(\alpha_1^{r}, \ldots, \alpha_n^{r})]^{1/r}$ is decreasing in $r > 0$ whenever $\alpha_i > 0$, $i = 1, 2, \ldots, n$. Thus $[\Phi(\alpha_1^{r}, \ldots, \alpha_n^{r})\,\Phi(\alpha_1^{-r}, \ldots, \alpha_n^{-r})]^{1/r}$ is decreasing in $r > 0$.

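Proof. For $0 < r < s$ and $\alpha_1 \ge \cdots \ge \alpha_n > 0$, we have that

$$1 \le \Phi([\alpha_1/\alpha_1]^{s}, \ldots, [\alpha_n/\alpha_1]^{s}) \le \Phi([\alpha_1/\alpha_1]^{r}, \ldots, [\alpha_n/\alpha_1]^{r}),$$

the first inequality by bIV and (2.1). The second inequality is (3.5). Thus

$$[\Phi([\alpha_1/\alpha_1]^{s}, \ldots, [\alpha_n/\alpha_1]^{s})]^{1/s} \le [\Phi([\alpha_1/\alpha_1]^{s}, \ldots, [\alpha_n/\alpha_1]^{s})]^{1/r},$$

so that

$$[\Phi([\alpha_1/\alpha_1]^{s}, \ldots, [\alpha_n/\alpha_1]^{s})]^{1/s} \le [\Phi([\alpha_1/\alpha_1]^{r}, \ldots, [\alpha_n/\alpha_1]^{r})]^{1/r}.$$

The theorem now follows from bII. ||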

Theorem 3.5 can, of course, be specialized to yield a kind of converse to (c).

Theorem 3.6. If $\varphi$ is a unitarily invariant norm satisfying aIV, then

(c*) $[c_\varphi(AA^*)]^{1/2} \le c_\varphi(A)$.
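Together, (c) and (c*) bracket $c_\varphi(A)$ between $[c_\varphi(AA^*)]^{1/2}$ and $c_\varphi(AA^*)$. A minimal numerical sketch follows, assuming NumPy and three unitarily invariant norms that satisfy aIV (spectral, Frobenius, and nuclear, selected through the `ord` argument of `np.linalg.norm`); the real test matrix and seed are arbitrary assumptions.

```python
import numpy as np

def cond(A, ord_):
    """c_phi(A) = phi(A) * phi(A^{-1}) with phi = np.linalg.norm(., ord_)."""
    return np.linalg.norm(A, ord_) * np.linalg.norm(np.linalg.inv(A), ord_)

rng = np.random.default_rng(2)  # arbitrary seed
A = rng.standard_normal((5, 5))
G = A @ A.T  # AA* for a real matrix

rows = []  # entries: (norm, sqrt(c(AA*)), c(A), c(AA*))
for ord_ in (2, "fro", "nuc"):
    cA, cG = cond(A, ord_), cond(G, ord_)
    rows.append((ord_, np.sqrt(cG), cA, cG))
    print(f"{ord_!s:>4}: sqrt(c(AA*)) = {np.sqrt(cG):.4f}, "
          f"c(A) = {cA:.4f}, c(AA*) = {cG:.4f}")
```

For the spectral norm the lower bound is attained, so the first two printed values agree up to rounding.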


Condition (c*) can also be obtained under somewhat different hypotheses. In particular, if $\varphi$ satisfies aV, then

$$c_\varphi(AA^*) = \varphi(AA^*)\,\varphi((AA^*)^{-1}) \le \varphi(A)\,\varphi(A^{-1})\,\varphi(A^*)\,\varphi(A^{*-1}) = c_\varphi(A)\,c_\varphi(A^*).$$

If also $\varphi(A) = \varphi(A^*)$, then (c*) follows. Of course, $\varphi(A) = \varphi(A^*)$ if $\varphi$ is unitarily invariant.

4. Independence of the norm axioms and (c).

It is our purpose here to show that the condition number inequality (c) does not follow from the usual norm axioms aI - aV. In fact, aII, aIII, aIV, aV and (c) are independent.

Remark. It has been shown by Ostrowski [3] that aI is implied by aII, aIII, aV, together with $\varphi(A) \not\equiv 0$, so that aI is not included in the list of independent properties. Rella [4] has shown that aII, aIII, aIV and aV are independent, and we add (c) to this list.

The results which prove the independence of aII - aV and (c) are summarized in the following table, where + (-) indicates that a property is true (false).


φ(A)                       aII   aIII   aIV   aV   (c)
1                           -     +      +     +    +
(rank A)(tr AA*)^{1/2}      +     -      +     +    +
n max |a_ij|                +     +      -     +    +
max |a_ij|                  +     +      +     -    +
[illegible in source]       +     +      +     +    -
The remainder of this paper is devoted to proving the propositions indicated in the table.

The results for $\varphi(A) \equiv 1$ are obvious, so we begin by considering $\varphi(A) = (\operatorname{rank} A)(\operatorname{tr} AA^*)^{1/2}$. In this case, aII and aIV are obvious, and (c) follows from Theorem 3.1 since $(\operatorname{tr} AA^*)^{1/2}$ is unitarily invariant. As is well known, $(\operatorname{tr} AA^*)^{1/2}$ satisfies aV; this together with $\operatorname{rank} AB \le (\operatorname{rank} A)(\operatorname{rank} B)$ yields aV for $\varphi(A) = (\operatorname{rank} A)(\operatorname{tr} AA^*)^{1/2}$. That aIII is violated may be seen by taking $A = I$ and $B$ the matrix with a unit in the (1,1)-th place and zeros elsewhere.
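The failure of aIII for this choice of $\varphi$ can be verified directly. The sketch below assumes NumPy and takes $n = 3$ as an arbitrary illustrative dimension.

```python
import numpy as np

def phi(A):
    """phi(A) = (rank A) * (tr AA*)^{1/2}."""
    return np.linalg.matrix_rank(A) * np.linalg.norm(A, "fro")

n = 3  # illustrative dimension
A = np.eye(n)
B = np.zeros((n, n))
B[0, 0] = 1.0  # unit in the (1,1)-th place, zeros elsewhere

lhs = phi(A + B)       # rank 3 times sqrt(4 + 1 + 1)
rhs = phi(A) + phi(B)  # 3 * sqrt(3) plus 1 * 1
print(lhs, rhs, lhs > rhs)
```

Here $A + B = \operatorname{diag}(2, 1, 1)$, so $\varphi(A+B) = 3\sqrt{6}$ exceeds $\varphi(A) + \varphi(B) = 3\sqrt{3} + 1$, while $\varphi(B) = 1$ is consistent with aIV.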

For $\varphi(A) = n \max_{i,j} |a_{ij}|$ and $\max_{i,j} |a_{ij}|$ the first four columns of the table are well known, and we need only prove (c). Let $e_i$ be the row vector with one in the i-th position and zero elsewhere. Denote $M^{-1} = (m^{ij})$ where $M = (m_{ij})$, and let $U = AA^*$. By Cauchy's inequality,

$$|a_{ij}|\,|a^{\alpha\beta}| = |e_i A e_j^*|\,|e_\alpha A^{-1} e_\beta^*| \le [(e_i U e_i^*)(e_j e_j^*)(e_\alpha e_\alpha^*)(e_\beta U^{-1} e_\beta^*)]^{1/2} = (u_{ii}\,u^{\beta\beta})^{1/2}.$$

Hence,

$$\max_{i,j} |a_{ij}|\,\max_{\alpha,\beta} |a^{\alpha\beta}| \le (\max_i |u_{ii}|\,\max_\beta |u^{\beta\beta}|)^{1/2},$$

or

$$c_\varphi(A) \le [c_\varphi(AA^*)]^{1/2}.$$
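The entrywise bounds behind this argument can be checked numerically for $\varphi(A) = \max_{i,j} |a_{ij}|$. The sketch assumes NumPy; the test matrix, seed, and variable names are illustrative, and $U^{-1}$ is formed as $(A^{-1})^* A^{-1}$, which equals $(AA^*)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A_inv = np.linalg.inv(A)
U = A @ A.conj().T
U_inv = A_inv.conj().T @ A_inv  # (AA*)^{-1} = (A^{-1})* A^{-1}

tol = 1 + 1e-9  # small relative slack for rounding
# |a_ij|^2 <= u_ii (bound by the i-th diagonal entry of U, row-wise).
row_bound = np.abs(A) ** 2 <= np.real(np.diag(U))[:, None] * tol
# |a^{alpha beta}|^2 <= u^{beta beta} (bound by diagonal of U^{-1}, column-wise).
col_bound = np.abs(A_inv) ** 2 <= np.real(np.diag(U_inv))[None, :] * tol

max_norm = lambda M: np.abs(M).max()
c_A = max_norm(A) * max_norm(A_inv)
c_U = max_norm(U) * max_norm(U_inv)
print(row_bound.all(), col_bound.all(), c_A <= np.sqrt(c_U) * tol)
```

The two boolean arrays correspond to the two applications of Cauchy's inequality, and the final comparison is the resulting bound on the condition numbers.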


Since  U  =  AA*  is  positive  semi -definite. 
Since $U = AA^*$ is positive semi-definite.